Convergence of Large Margin Separable Linear Classification
Abstract
Large margin linear classification methods have been applied successfully in many domains. For a linearly separable problem, it is known that, under appropriate assumptions, the expected misclassification error of the computed “optimal hyperplane” approaches zero at a rate proportional to the inverse of the training sample size. This rate is usually characterized by the margin and the maximum norm of the input data. In this paper, we argue that another quantity, namely the robustness of the input data distribution, also plays an important role in characterizing the convergence behavior of the expected misclassification error. Based on this concept of robustness, we show that for a large margin separable linear classification problem, the expected misclassification error may converge to zero exponentially fast in the training sample size.
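For orientation, the two regimes contrasted in the abstract can be written schematically as follows, where ŵ_n is the optimal hyperplane computed from n training samples, R bounds the input norm, γ is the margin, and c > 0 is a distribution-dependent constant; the exact constants and conditions are in the paper and are not reproduced here.

    \[
      \mathbb{E}\bigl[\mathrm{err}(\hat{w}_n)\bigr] \;\lesssim\; \frac{\mathbb{E}\!\left[R^2/\gamma^2\right]}{n}
      \qquad \text{(classical $1/n$-type rate),}
    \]
    \[
      \mathbb{E}\bigl[\mathrm{err}(\hat{w}_n)\bigr] \;\lesssim\; e^{-c\,n}
      \qquad \text{(rate claimed under the robustness assumption).}
    \]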
Similar Papers
SVM Soft Margin Classifiers: Linear Programming versus Quadratic Programming
Support vector machine soft margin classifiers are important learning algorithms for classification problems. They can be stated as convex optimization problems and are suitable for large-scale data settings. The linear programming SVM classifier is especially efficient for very large sample sizes, but little is known about its convergence, compared with the well-understood quadratic programming SVM clas...
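To make the linear-programming side concrete, here is a minimal sketch of one common LP soft-margin formulation (L1-norm penalty on w plus hinge slacks), solved with scipy.optimize.linprog; the toy data, the constant C, and this particular formulation are illustrative assumptions and may differ from the paper's setup.

    # Sketch: L1-regularized soft-margin SVM as a linear program (illustrative).
    # Variables: w = u - v (u, v >= 0), bias = bp - bn, slacks xi >= 0.
    # Minimize  sum(u + v) + C * sum(xi)
    # subject to  y_i * (w . x_i + bias) >= 1 - xi_i.
    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(2.0, 0.5, (20, 2)),     # toy separable data (assumption)
                   rng.normal(-2.0, 0.5, (20, 2))])
    y = np.hstack([np.ones(20), -np.ones(20)])
    n, d = X.shape
    C = 1.0

    # Decision vector z = [u (d), v (d), bp, bn, xi (n)], all components >= 0.
    c = np.hstack([np.ones(2 * d), [0.0, 0.0], C * np.ones(n)])
    A_ub = np.zeros((n, 2 * d + 2 + n))
    A_ub[:, :d] = -y[:, None] * X          # -y_i * x_i . u
    A_ub[:, d:2 * d] = y[:, None] * X      # +y_i * x_i . v
    A_ub[:, 2 * d] = -y                    # -y_i * bp
    A_ub[:, 2 * d + 1] = y                 # +y_i * bn
    A_ub[:, 2 * d + 2:] = -np.eye(n)       # -xi_i
    b_ub = -np.ones(n)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * d + 2 + n), method="highs")
    w = res.x[:d] - res.x[d:2 * d]
    b = res.x[2 * d] - res.x[2 * d + 1]
    print("train accuracy:", np.mean(np.sign(X @ w + b) == y))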
A Linearly Convergent Linear-Time First-Order Algorithm for Support Vector Classification with a Core Set Result
We present a simple, first-order approximation algorithm for the support vector classification problem. Given a pair of linearly separable data sets and ε ∈ (0, 1), the proposed algorithm computes a separating hyperplane whose margin is within a factor of (1 − ε) of that of the maximum-margin separating hyperplane. We discuss how our algorithm can be extended to nonlinearly separable and inseparab...
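The snippet above does not reproduce the algorithm itself; the following is a hedged sketch of a Gilbert/Frank-Wolfe-style first-order procedure for the same task, which approximates the minimum-norm point of the difference of the two convex hulls and thereby a near-maximum-margin separator. The toy data, the tolerance eps, and the update rule are assumptions and need not match the authors' algorithm or its core-set analysis.

    # Sketch: first-order (Gilbert / Frank-Wolfe style) approximation of the
    # maximum-margin separator for two linearly separable point sets.
    # Idea: the max-margin direction is the minimum-norm point of
    # conv(P) - conv(Q); each iteration needs only one pass over the data.
    import numpy as np

    rng = np.random.default_rng(1)
    P = rng.normal(2.0, 0.5, (30, 2))    # class +1 (toy data, assumption)
    Q = rng.normal(-2.0, 0.5, (30, 2))   # class -1

    eps = 1e-3
    w = P.mean(axis=0) - Q.mean(axis=0)  # a point of conv(P) - conv(Q)
    for _ in range(10_000):
        # Support point of conv(P) - conv(Q) minimizing <w, .>
        s = P[np.argmin(P @ w)] - Q[np.argmax(Q @ w)]
        gap = w @ w - w @ s              # Frank-Wolfe gap (up to a constant factor)
        if gap <= eps * (w @ w):
            break
        # Exact line search for min_t ||(1 - t) * w + t * s||, t in [0, 1]
        t = np.clip(w @ (w - s) / ((w - s) @ (w - s)), 0.0, 1.0)
        w = (1 - t) * w + t * s

    # Place the hyperplane halfway between the two classes along direction w.
    b = -(np.min(P @ w) + np.max(Q @ w)) / 2.0
    margin = (np.min(P @ w) - np.max(Q @ w)) / (2.0 * np.linalg.norm(w))
    print("achieved margin:", margin, "upper bound on optimum:", np.linalg.norm(w) / 2.0)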
Large Margin Kernel Pocket Algorithm
Two attractive advantages of SVMs are the ideas of kernels and of large margins. As a linear learning machine, the original pocket algorithm can handle both linearly and nonlinearly separable problems. In order to improve its classification ability and control its generalization, we generalize the original pocket algorithm by using kernels and adding a margin criterion, and propose its kernel and...
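For context, the original (linear) pocket algorithm that the snippet builds on can be sketched as below: a perceptron that keeps in its "pocket" the best weight vector seen so far, as measured by training accuracy. The kernelized, large-margin variant proposed in the paper is not reproduced; the toy data and epoch count are illustrative assumptions.

    # Sketch: the classic pocket algorithm (perceptron + best-so-far weights).
    import numpy as np

    def pocket(X, y, epochs=50, seed=0):
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w, b = np.zeros(d), 0.0
        pocket_w, pocket_b = w.copy(), b
        best = np.mean(np.sign(X @ w + b) == y)    # accuracy of the current pocket
        for _ in range(epochs):
            for i in rng.permutation(n):
                if y[i] * (X[i] @ w + b) <= 0:     # misclassified: perceptron update
                    w = w + y[i] * X[i]
                    b = b + y[i]
                    acc = np.mean(np.sign(X @ w + b) == y)
                    if acc > best:                 # keep the best weights found so far
                        best, pocket_w, pocket_b = acc, w.copy(), b
        return pocket_w, pocket_b, best

    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(1.5, 1.0, (25, 2)), rng.normal(-1.5, 1.0, (25, 2))])
    y = np.hstack([np.ones(25), -np.ones(25)])     # possibly overlapping classes
    w, b, acc = pocket(X, y)
    print("pocket training accuracy:", acc)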
Convergence of Gradient Descent on Separable Data
The implicit bias of gradient descent is not fully understood even in simple linear classification tasks (e.g., logistic regression). Soudry et al. (2018) studied this bias on separable data, where there are multiple solutions that correctly classify the data. It was found that, when optimizing monotonically decreasing loss functions with exponential tails using gradient descent, the linear cla...
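The phenomenon is easy to observe numerically. The sketch below runs plain gradient descent on the logistic loss over a small hand-made separable dataset (no bias term, chosen symmetric so the max-margin intercept is zero) and compares the direction of the iterates with the hard-margin SVM direction; the dataset, step size, and checkpoints are illustrative assumptions.

    # Sketch: gradient descent on logistic loss over separable data drifts,
    # in direction, toward the maximum-margin separator (Soudry et al., 2018).
    import numpy as np
    from scipy.special import expit
    from sklearn.svm import SVC

    X = np.array([[2.0, 1.0], [3.0, 2.0], [1.0, 3.0],
                  [-2.0, -1.0], [-3.0, -2.0], [-1.0, -3.0]])
    y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])   # linearly separable toy data

    # Reference max-margin direction via a (nearly) hard-margin linear SVM.
    ref = SVC(kernel="linear", C=1e6).fit(X, y).coef_.ravel()
    ref = ref / np.linalg.norm(ref)

    w = np.zeros(2)
    lr = 0.1
    for step in range(1, 200_001):
        m = y * (X @ w)                                        # per-sample margins
        g = -(X * (y * expit(-m))[:, None]).mean(axis=0)       # logistic-loss gradient
        w -= lr * g
        if step in (100, 1_000, 10_000, 100_000, 200_000):
            cos = w @ ref / np.linalg.norm(w)
            print(f"step {step:>7}: cosine with max-margin direction = {cos:.4f}")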
Ho-Kashyap with Early Stopping Versus Soft Margin SVM for Linear Classifiers - An Application
In a classification problem, hard margin SVMs tend to minimize the generalization error by maximizing the margin. Regularization is obtained with soft margin SVMs, which improve performance by relaxing the constraints on margin maximization. This article shows that comparable performance can be obtained in the linearly separable case with the Ho–Kashyap learning rule combined with early st...
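For reference, the classical Ho-Kashyap rule that the article starts from can be sketched as below (Duda-Hart form, with the iteration cap playing the role of a crude early stop); the toy data, learning rate rho, and stopping choices are illustrative assumptions rather than the article's exact setup.

    # Sketch: classical Ho-Kashyap rule. Seek a and a margin vector m > 0 with
    # Y a = m, where row i of Y is y_i * [x_i, 1] (augmented pattern).
    import numpy as np

    def ho_kashyap(X, y, rho=0.5, max_iter=1000, tol=1e-6):
        Y = y[:, None] * np.hstack([X, np.ones((len(X), 1))])
        Y_pinv = np.linalg.pinv(Y)
        m = np.ones(len(X))                 # target margins, kept positive
        a = Y_pinv @ m
        for _ in range(max_iter):           # the cap acts as a crude early stop
            e = Y @ a - m
            if np.all(np.abs(e) < tol):     # Y a = m > 0: separating solution found
                break
            m = m + rho * (e + np.abs(e))   # only increase m where e > 0
            a = Y_pinv @ m
        return a[:-1], a[-1]                # weights w and bias b

    rng = np.random.default_rng(3)
    X = np.vstack([rng.normal(2.0, 0.6, (20, 2)), rng.normal(-2.0, 0.6, (20, 2))])
    y = np.hstack([np.ones(20), -np.ones(20)])
    w, b = ho_kashyap(X, y)
    print("train accuracy:", np.mean(np.sign(X @ w + b) == y))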